Search for: All records, where Creators/Authors contains: "Domini, Fulvio"


  1. Depth estimation is fundamental to 3D perception, and humans are known to have biased estimates of depth. This study investigates whether convolutional neural networks (CNNs) can be similarly biased when predicting the sign of curvature and the depth of textured surfaces under different viewing conditions (field of view) and surface parameters (slant and texture irregularity). This hypothesis is drawn from the idea that texture gradients described by local neighborhoods, a cue identified in the human vision literature, are also representable within convolutional neural networks. To this end, we trained both unsupervised and supervised CNN models on renderings of slanted surfaces with random polka-dot patterns and analyzed their internal latent representations. The results show that the unsupervised models exhibit prediction biases similar to those of humans across all experiments, while the supervised CNN models do not. The latent spaces of the unsupervised models can be linearly separated into axes representing field of view and optical slant. For supervised models, this separability varies substantially with model architecture and the kind of supervision (continuous slant vs. sign of slant). Although this study makes no claim about a shared mechanism, these findings suggest that unsupervised CNN models can produce predictions similar to those of the human visual system. Code: github.com/brownvc/Slant-CNN-Biases
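    As an illustration of the linear-separability claim above: a minimal linear-probe sketch, assuming hypothetical file and array names (latents.npz, labels.npz, key names included); these are not the linked repository's actual API, which contains the real code.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Hypothetical inputs (file names, keys, and shapes are assumptions):
    # latent vectors from a trained CNN and the optical slant of the
    # rendered surface each vector came from.
    latents = np.load("latents.npz")["z"]            # shape (n_images, d)
    slant = np.load("labels.npz")["optical_slant"]   # shape (n_images,)

    # Linear probe: if optical slant is represented along a linear axis of
    # the latent space, plain linear regression should recover it well.
    probe = LinearRegression().fit(latents, slant)
    print("R^2 for optical slant:", probe.score(latents, slant))
    ```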
  2. How the brain derives 3D information from inherently ambiguous visual input remains the fundamental question of human vision. The past two decades of research have addressed this question as a problem of probabilistic inference, the dominant model being maximum-likelihood estimation (MLE). This model assumes that independent depth-cue modules derive noisy but statistically accurate estimates of 3D scene parameters that are combined through a weighted average. Cue weights are adjusted based on the system's representation of each module's output variability. Here I demonstrate that the MLE model fails to account for important psychophysical findings and, importantly, misinterprets the just noticeable difference, a hallmark measure of stimulus discriminability, as an estimate of perceptual uncertainty. I propose a new theory, termed Intrinsic Constraint, which postulates that the visual system does not derive the most probable interpretation of the visual input, but rather the most stable interpretation amid variations in viewing conditions. This goal is achieved with the Vector Sum model, which represents individual cue estimates as components of a multi-dimensional vector whose norm determines the combined output. This model accounts for the psychophysical findings cited in support of MLE, while predicting existing and new findings that contradict the MLE model. This article is part of a discussion meeting issue ‘New approaches to 3D vision’.
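    A minimal sketch of the two combination rules contrasted above, with made-up numbers: the MLE rule is the standard inverse-variance weighted average, while the Vector Sum rule follows the abstract's description of a norm over cue components.

    ```python
    import numpy as np

    def mle_combine(estimates, sigmas):
        """Reliability-weighted average: weights are inverse variances,
        normalized to sum to 1 (standard MLE cue combination)."""
        w = 1.0 / np.square(sigmas)
        w /= w.sum()
        return np.dot(w, estimates)

    def vector_sum_combine(estimates):
        """Vector Sum model as described in the abstract: cue estimates are
        components of a vector whose norm determines the combined output."""
        return np.linalg.norm(estimates)

    # Illustrative numbers only: two depth-cue estimates with different noise.
    cues = np.array([0.8, 1.2])      # e.g., stereo and texture estimates
    noise = np.array([0.1, 0.3])     # assumed cue standard deviations
    print(mle_combine(cues, noise))  # reliability-weighted average
    print(vector_sum_combine(cues))  # norm of the cue vector
    ```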
  3. Motor learning in visuomotor adaptation tasks results from both explicit and implicit processes, each responding differently to an error signal. Although the motor output side of these processes has been extensively studied, the visual input side is relatively unknown. We investigated whether and how depth perception affects the computation of error information by explicit and implicit motor learning. Two groups of participants made reaching movements to bring a virtual cursor to a target in the frontoparallel plane. The Delayed group was allowed to reaim and their feedback was delayed to emphasize explicit learning, whereas the Clamped group received task-irrelevant clamped cursor feedback and continued to aim straight at the target to emphasize implicit adaptation. Both groups played this game in a highly detailed virtual environment (depth condition), leveraging a cover task of playing darts in a virtual tavern, and in an empty environment (no-depth condition). The Delayed group showed an increase in error sensitivity under depth relative to no-depth. In contrast, the Clamped group adapted to the same degree under both conditions. The movement kinematics of the Delayed participants also changed under the depth condition, consistent with the target appearing more distant, unlike the Clamped group. A comparison of the Delayed group's behavioral data with a perceptual task from the same individuals showed that the greater reaiming in the depth condition was consistent with an increase in the scaling of the error distance and size. These findings suggest that explicit and implicit learning processes may rely on different sources of perceptual information. NEW & NOTEWORTHY We leveraged a classic sensorimotor adaptation task to perform a first systematic assessment of the role of perceptual cues in the estimation of an error signal in 3D space during motor learning. We crossed two conditions presenting different amounts of depth information with two manipulations emphasizing explicit and implicit learning processes. Explicit learning responded to the visual conditions, consistent with perceptual reports, whereas implicit learning appeared to be independent of them.
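    The error-sensitivity result can be pictured with a standard single-rate state-space model of trial-by-trial adaptation (a textbook form, not the paper's fitted model); all parameter values below are invented for illustration.

    ```python
    import numpy as np

    # Single-rate state-space model: x[t+1] = a * x[t] + b * e[t], with
    # retention a and error sensitivity b. The abstract's key result maps
    # onto a larger b for explicit learning under the depth condition.
    def simulate(n_trials, rotation, a=0.99, b=0.2):
        x = np.zeros(n_trials + 1)           # internal compensation state
        for t in range(n_trials):
            error = rotation - x[t]          # visual error on trial t
            x[t + 1] = a * x[t] + b * error  # retain, then learn from error
        return x

    depth    = simulate(80, rotation=15.0, b=0.30)  # higher error sensitivity
    no_depth = simulate(80, rotation=15.0, b=0.20)  # lower error sensitivity
    print(depth[-1], no_depth[-1])  # depth condition compensates more
    ```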
  4. Visually guided movements can show surprising accuracy even when the perceived three-dimensional (3D) shape of the target is distorted. One explanation of this paradox is that an evolutionarily specialized “vision-for-action” system provides accurate shape estimates by relying selectively on stereo information and ignoring less reliable sources of shape information like texture and shading. However, the key support for this hypothesis has come from studies that analyze average behavior across many visuomotor interactions where available sensory feedback reinforces stereo information. The present study, which carefully accounts for the effects of feedback, shows that visuomotor interactions with slanted surfaces are actually planned using the same cue-combination function as slant perception and that apparent dissociations can arise due to two distinct supervised learning processes: sensorimotor adaptation and cue reweighting. In two experiments, we show that when a distorted slant cue biases perception (e.g., surfaces appear flattened by a fixed amount), sensorimotor adaptation rapidly adjusts the planned grip orientation to compensate for this constant error. However, when the distorted slant cue is unreliable, leading to variable errors across a set of objects (i.e., some slants are overestimated, others underestimated), then relative cue weights are gradually adjusted to reduce the misleading effect of the unreliable cue, consistent with previous perceptual studies of cue reweighting. The speed and flexibility of these two forms of learning provide an alternative explanation of why perception and action are sometimes found to be dissociated in experiments where some 3D shape cues are consistent with sensory feedback while others are faulty. NEW & NOTEWORTHY When interacting with three-dimensional (3D) objects, sensory feedback is available that could improve future performance via supervised learning. Here we confirm that natural visuomotor interactions lead to sensorimotor adaptation and cue reweighting, two distinct learning processes uniquely suited to resolve errors caused by biased and noisy 3D shape cues. These findings explain why perception and action are often found to be dissociated in experiments where some cues are consistent with sensory feedback while others are faulty.
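    A toy sketch of the two learning processes described above, with invented parameters (this illustrates the idea, not the authors' model): a constant cue bias is absorbed by an adaptive offset, while a noisy cue gradually loses weight.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Planned grip slant = weighted sum of two cues + an adaptive offset.
    w = np.array([0.5, 0.5])          # weights for stereo and texture cues
    offset = 0.0                      # sensorimotor adaptation state
    eta_adapt, eta_weight = 0.2, 0.02

    for trial in range(300):
        true_slant = rng.uniform(20.0, 40.0)
        stereo = true_slant - 5.0                  # constant bias: flattened
        texture = true_slant + rng.normal(0, 8.0)  # unreliable, variable error
        planned = w @ np.array([stereo, texture]) + offset
        error = true_slant - planned               # error from feedback

        # Sensorimotor adaptation: the offset absorbs the constant error.
        offset += eta_adapt * error
        # Cue reweighting: shift weight toward the cue closer to feedback.
        cue_err = np.abs(np.array([stereo, texture]) - true_slant)
        w += eta_weight * (cue_err.mean() - cue_err)
        w = np.clip(w, 0.0, None)
        w /= w.sum()

    print(offset, w)  # offset compensates the bias; texture weight shrinks
    ```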
  5. Because the motions of everyday objects obey Newtonian mechanics, perhaps these laws or approximations thereof are internalized by the brain to facilitate motion perception. Shepard’s seminal investigations of this hypothesis demonstrated that the visual system fills in missing information in a manner consistent with kinematic constraints. Here, we show that perception relies on internalized regularities not only when filling in missing information but also when available motion information is inconsistent with the expected outcome of a physical event. When healthy adult participants (Ns = 11, 11, and 12 in Experiments 1, 2, and 3, respectively) viewed 3D billiard-ball collisions demonstrating varying degrees of consistency with Newtonian mechanics, their perceptual judgments of postcollision trajectories were biased toward the Newtonian outcome. These results were consistent with a maximum-likelihood model of sensory integration in which perceived target motion following a collision is a reliability-weighted average of a sensory estimate and an internal prediction consistent with Newtonian mechanics.
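    The reliability-weighted average in the final sentence has a simple closed form; a minimal sketch with invented numbers follows, where the perceived post-collision direction is pulled from the displayed (sensory) trajectory toward the Newtonian prediction in proportion to their inverse variances.

    ```python
    import numpy as np

    def perceived_direction(sensory_deg, newtonian_deg, sigma_sensory, sigma_prior):
        """Reliability-weighted average of a sensory estimate and an internal
        Newtonian prediction, the integration rule the abstract describes.
        Weights are inverse variances, normalized to sum to 1."""
        w_s = 1.0 / sigma_sensory**2
        w_n = 1.0 / sigma_prior**2
        return (w_s * sensory_deg + w_n * newtonian_deg) / (w_s + w_n)

    # Illustrative numbers only: the displayed trajectory deviates from the
    # Newtonian outcome, and the percept is pulled partway toward it.
    print(perceived_direction(sensory_deg=30.0, newtonian_deg=10.0,
                              sigma_sensory=5.0, sigma_prior=5.0))  # -> 20.0
    ```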